# AWSBedrockStream

The AWS Bedrock stream functions are utilities that transform the output from the [AWS Bedrock API](https://docs.aws.amazon.com/bedrock/latest/APIReference/welcome.html) into a `ReadableStream`. They use `AIStream` under the hood and handle parsing Bedrock's response.

There are currently three AWS Bedrock stream wrappers:

- `AWSBedrockAnthropicStream` for Anthropic models
- `AWSBedrockCohereStream` for Cohere models
- `AWSBedrockLlama2Stream` for Llama 2 models

This works with the official Bedrock API, and it's supported in Node.js, [Edge Runtime](https://edge-runtime.vercel.app), and browser environments.

## `AWSBedrock*Stream(res: Response, cb?: AIStreamCallbacks): ReadableStream` [#AWSBedrockStream]

## Parameters

### `res: Response`

The `Response` object returned by the request to the AWS Bedrock API.

### `cb?: AIStreamCallbacks`

This optional parameter is an object of callback functions that handle the start of the stream, each token as it arrives, and the completion of the AI response. If omitted, default behavior is used.
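As a sketch of how these callbacks fit together (the local `Callbacks` type and handler bodies below are illustrative, not the SDK's exact definitions), the object can be exercised directly:

```typescript
// Illustrative callbacks object. The handler signatures mirror
// AIStreamCallbacks, but this local type and the bodies are only a sketch.
type Callbacks = {
  onStart?: () => Promise<void>;
  onToken?: (token: string) => Promise<void>;
  onCompletion?: (completion: string) => Promise<void>;
};

const received: string[] = [];

const callbacks: Callbacks = {
  // Runs once when the stream opens
  onStart: async () => {
    received.length = 0;
  },
  // Runs for every token as it arrives
  onToken: async (token) => {
    received.push(token);
  },
  // Runs once with the full text after the stream ends
  onCompletion: async (completion) => {
    console.log(`completion: ${completion}`);
  },
};

// Simulate the lifecycle the stream wrapper would drive:
async function simulate(tokens: string[]): Promise<string> {
  await callbacks.onStart?.();
  for (const token of tokens) {
    await callbacks.onToken?.(token);
  }
  const completion = received.join('');
  await callbacks.onCompletion?.(completion);
  return completion;
}

console.log(await simulate(['Hello', ', ', 'world'])); // Hello, world
```

The object would then be passed as the second argument, e.g. `AWSBedrockAnthropicStream(bedrockResponse, callbacks)`.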

## Example: Anthropic Models

The `AWSBedrockAnthropicStream` function can be coupled with a call to an Anthropic model using the Bedrock API to generate a readable stream of the completion. This stream can then facilitate the real-time consumption of AI outputs as they're being generated.

Here's a step-by-step example of how to implement this in Next.js:

```ts filename="app/api/chat/route.ts"
import {
  BedrockRuntimeClient,
  InvokeModelWithResponseStreamCommand,
} from '@aws-sdk/client-bedrock-runtime';
import { AWSBedrockAnthropicStream, StreamingTextResponse } from 'ai';
import { experimental_buildAnthropicPrompt } from 'ai/prompts';

// IMPORTANT! Set the runtime to edge
export const runtime = 'edge';

const bedrockClient = new BedrockRuntimeClient({
  region: process.env.AWS_REGION ?? 'us-east-1',
  credentials: {
    accessKeyId: process.env.AWS_ACCESS_KEY_ID ?? '',
    secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY ?? '',
  },
});

export async function POST(req: Request) {
  // Extract the `messages` from the body of the request
  const { messages } = await req.json();

  // Ask Claude for a streaming chat completion given the prompt
  const bedrockResponse = await bedrockClient.send(
    new InvokeModelWithResponseStreamCommand({
      modelId: 'anthropic.claude-v2',
      contentType: 'application/json',
      accept: 'application/json',
      body: JSON.stringify({
        prompt: experimental_buildAnthropicPrompt(messages),
        max_tokens_to_sample: 300,
      }),
    }),
  );

  // Convert the response into a friendly text-stream
  const stream = AWSBedrockAnthropicStream(bedrockResponse);

  // Respond with the stream
  return new StreamingTextResponse(stream);
}
```

In this example, the `AWSBedrockAnthropicStream` function transforms the text generation stream from Bedrock into a `ReadableStream` of parsed results. This allows clients to consume AI outputs in real-time as they're generated, instead of waiting for the complete response.
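On the client, the streamed response body is a standard `ReadableStream` and can be read incrementally with the Web Streams API. This sketch fakes the network response with a locally constructed stream so the reading loop can be shown end to end (the chunk values are placeholders):

```typescript
// Stand-in for `response.body` from fetch('/api/chat', ...): a byte stream
// that emits the completion in chunks, as the route handler would.
const encoder = new TextEncoder();
const body = new ReadableStream<Uint8Array>({
  start(controller) {
    for (const chunk of ['Streaming ', 'from ', 'Bedrock']) {
      controller.enqueue(encoder.encode(chunk));
    }
    controller.close();
  },
});

// Read the stream incrementally, decoding bytes to text as chunks arrive.
async function readAll(stream: ReadableStream<Uint8Array>): Promise<string> {
  const reader = stream.getReader();
  const decoder = new TextDecoder();
  let text = '';
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    text += decoder.decode(value, { stream: true });
    // In a UI you would append `text` to state here instead of buffering.
  }
  return text;
}

const completion = await readAll(body);
console.log(completion); // "Streaming from Bedrock"
```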

## Example: Cohere Models

The `AWSBedrockCohereStream` function can be coupled with a call to a Cohere model using the Bedrock API to generate a readable stream of the completion. This stream can then facilitate the real-time consumption of AI outputs as they're being generated.

Here's a step-by-step example of how to implement this in Next.js:

```ts filename="app/api/chat/route.ts"
import {
  BedrockRuntimeClient,
  InvokeModelWithResponseStreamCommand,
} from '@aws-sdk/client-bedrock-runtime';
import { AWSBedrockCohereStream, StreamingTextResponse } from 'ai';

const bedrockClient = new BedrockRuntimeClient({
  region: process.env.AWS_REGION ?? 'us-east-1',
  credentials: {
    accessKeyId: process.env.AWS_ACCESS_KEY_ID ?? '',
    secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY ?? '',
  },
});

function buildPrompt(
  messages: { content: string; role: 'system' | 'user' | 'assistant' }[],
) {
  return (
    messages
      .map(({ content, role }) => {
        if (role === 'user') {
          return `Human: ${content}\n`;
        } else {
          return `Assistant: ${content}\n`;
        }
      })
      .join('') + 'Assistant:\n'
  );
}

export async function POST(req: Request) {
  // Extract the `messages` from the body of the request
  const { messages } = await req.json();

  // Ask Cohere for a streaming chat completion given the prompt
  const bedrockResponse = await bedrockClient.send(
    new InvokeModelWithResponseStreamCommand({
      modelId: 'cohere.command-light-text-v14',
      contentType: 'application/json',
      accept: 'application/json',
      body: JSON.stringify({
        prompt: buildPrompt(messages),
        max_tokens: 300,
      }),
    }),
  );

  // Convert the response into a friendly text-stream
  const stream = AWSBedrockCohereStream(bedrockResponse);

  // Respond with the stream
  return new StreamingTextResponse(stream);
}
```

In this example, the `AWSBedrockCohereStream` function transforms the text generation stream from Bedrock into a `ReadableStream` of parsed results. This allows clients to consume AI outputs in real-time as they're generated, instead of waiting for the complete response.
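To make the prompt format concrete, here is a standalone version of the prompt-building step (role-tagged turns joined into one string, ending with an open `Assistant:` turn for the model to complete) applied to a sample conversation:

```typescript
type Message = { content: string; role: 'system' | 'user' | 'assistant' };

// Build a single Human:/Assistant: transcript string from the chat messages.
function buildPrompt(messages: Message[]): string {
  return (
    messages
      .map(({ content, role }) =>
        role === 'user' ? `Human: ${content}\n` : `Assistant: ${content}\n`,
      )
      .join('') + 'Assistant:\n'
  );
}

const prompt = buildPrompt([
  { role: 'user', content: 'What is AWS Bedrock?' },
  { role: 'assistant', content: 'A managed service for foundation models.' },
  { role: 'user', content: 'Which models does it offer?' },
]);
console.log(prompt);
// Human: What is AWS Bedrock?
// Assistant: A managed service for foundation models.
// Human: Which models does it offer?
// Assistant:
```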

## Example: Llama 2 Models

The `AWSBedrockLlama2Stream` function can be coupled with a call to a Llama 2 model using the Bedrock API to generate a readable stream of the completion. This stream can then facilitate the real-time consumption of AI outputs as they're being generated.

Here's a step-by-step example of how to implement this in Next.js:

```ts filename="app/api/chat/route.ts"
import {
  BedrockRuntimeClient,
  InvokeModelWithResponseStreamCommand,
} from '@aws-sdk/client-bedrock-runtime';
import { AWSBedrockLlama2Stream, StreamingTextResponse } from 'ai';
import { experimental_buildLlama2Prompt } from 'ai/prompts';

const bedrockClient = new BedrockRuntimeClient({
  region: process.env.AWS_REGION ?? 'us-east-1',
  credentials: {
    accessKeyId: process.env.AWS_ACCESS_KEY_ID ?? '',
    secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY ?? '',
  },
});

export async function POST(req: Request) {
  // Extract the `messages` from the body of the request
  const { messages } = await req.json();

  // Ask Llama 2 for a streaming chat completion given the prompt
  const bedrockResponse = await bedrockClient.send(
    new InvokeModelWithResponseStreamCommand({
      modelId: 'meta.llama2-13b-chat-v1',
      contentType: 'application/json',
      accept: 'application/json',
      body: JSON.stringify({
        prompt: experimental_buildLlama2Prompt(messages),
        max_gen_len: 300,
      }),
    }),
  );

  // Convert the response into a friendly text-stream
  const stream = AWSBedrockLlama2Stream(bedrockResponse);

  // Respond with the stream
  return new StreamingTextResponse(stream);
}
```

In this example, the `AWSBedrockLlama2Stream` function transforms the text generation stream from Bedrock into a `ReadableStream` of parsed results. This allows clients to consume AI outputs in real-time as they're generated, instead of waiting for the complete response.
